Extraction of Career Profiles from Wikipedia

Authors

  • Firas Dib
  • Simon Lindberg
  • Pierre Nugues
Abstract

In this paper, we describe a system that gathers the work experience of a person from her or his Wikipedia page. We first extract an ontology of profession names from the Wikidata graph. We then parse the Wikipedia pages with a dependency parser and connect persons to professions by analyzing the parts of speech and dependency relations extracted from the text. Setting aside the dates, we computed recall and precision on a very limited, preliminary test set, reaching a recall of 74% and a precision of 95%, which suggests that the approach is promising.
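
The abstract does not spell out the linking rules, so the following is only a minimal sketch of the general idea, not the authors' system. It assumes a hand-written, UD-style dependency analysis of one sentence, a tiny hypothetical profession set standing in for the Wikidata ontology, and a single illustrative rule: a profession noun is attached to a person when the person is the subject of that noun (or of the first conjunct it is coordinated with). The names `Token`, `PROFESSIONS`, `extract_professions`, and `precision_recall`, as well as the person "Alice" and the gold set, are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int     # 1-based position in the sentence
    form: str    # surface form
    pos: str     # coarse part-of-speech tag
    head: int    # index of the head token, 0 for the root
    deprel: str  # dependency relation to the head


# Hypothetical stand-in for the profession ontology taken from Wikidata.
PROFESSIONS = {"physicist", "chemist", "professor", "engineer"}


def extract_professions(tokens, person):
    """Return the professions linked to `person` in one parsed sentence."""
    by_idx = {t.idx: t for t in tokens}
    found = set()
    for t in tokens:
        if t.pos != "NOUN" or t.form.lower() not in PROFESSIONS:
            continue
        # Walk coordinations ("physicist and chemist") back to the first conjunct.
        anchor = t
        while anchor.deprel == "conj" and anchor.head in by_idx:
            anchor = by_idx[anchor.head]
        # Keep the noun only if the person is the subject of the anchor.
        subjects = [s for s in tokens
                    if s.head == anchor.idx and s.deprel == "nsubj"]
        if any(s.form == person for s in subjects):
            found.add(t.form.lower())
    return found


def precision_recall(predicted, gold):
    """Set-based precision and recall; 0.0 when a denominator is empty."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy sentence "Alice was a physicist and chemist ." with a hand-written parse.
SENTENCE = [
    Token(1, "Alice", "PROPN", 4, "nsubj"),
    Token(2, "was", "AUX", 4, "cop"),
    Token(3, "a", "DET", 4, "det"),
    Token(4, "physicist", "NOUN", 0, "root"),
    Token(5, "and", "CCONJ", 6, "cc"),
    Token(6, "chemist", "NOUN", 4, "conj"),
    Token(7, ".", "PUNCT", 4, "punct"),
]

predicted = extract_professions(SENTENCE, "Alice")
gold = {"physicist", "chemist", "professor"}  # invented toy annotation
precision, recall = precision_recall(predicted, gold)
```

On this toy input every extracted profession is correct, while one annotated profession is missed because it does not occur in the sentence. That is the same pattern of high precision and lower recall that the abstract reports, although the toy numbers say nothing about the real system.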

Similar resources

A Comparison of Generated Wikipedia Profiles Using Social Labeling and Automatic Keyword Extraction

In many collaborative systems, researchers are interested in creating representative user profiles. In this paper, we are particularly interested in using social labeling and automatic keyword extraction techniques for generating user profiles. Social labeling is a process in which users manually tag other users with keywords. Automatic keyword extraction is a technique that selects the most sa...

Screening of the profiles of the essential oils from the aerial parts of Nepeta racemosa using classical and microwave-based methods: Comparison with the volatiles using headspace solid-phase micro-extraction

Background & Aim: Nepeta racemosa is an herbal and medicinal plant and this report aims to identify chemical compositions of the essential oils and volatiles of its aerial parts through classical and advanced methods. Experimental: Chemical profiles of the essential oils and volatile compounds from the aerial parts of Nepeta racemosa obtai...

Unsupervised Language-Independent Name Translation Mining from Wikipedia Infoboxes

The automatic generation of entity profiles from unstructured text, such as Knowledge Base Population, if applied in a multi-lingual setting, generates the need to align such profiles from multiple languages in an unsupervised manner. This paper describes an unsupervised and language-independent approach to mine name translation pairs from entity profiles, using Wikipedia Infoboxes as a stand-i...

Advertising Keyword Suggestion Using Relevance-Based Language Models from Wikipedia Rich Articles

When emerging technologies such as Search Engine Marketing (SEM) face tasks that require human level intelligence, it is inevitable to use the knowledge repositories to endow the machine with the breadth of knowledge available to humans. Keyword suggestion for search engine advertising is an important problem for sponsored search and SEM that requires a goldmine repository of knowledge. A recen...

Publication date: 2015